Towards Data Optimization in Storages and Networks
Title from PDF of title page, viewed on August 7, 2015
Dissertation advisors: Sejun Song and Baek-Young Choi
Vita
Includes bibliographic references (pages 132-140)
Thesis (Ph.D.)--School of Computing and Engineering. University of Missouri--Kansas City, 2015
We are encountering an explosion of data volume: one study estimates that data
will amount to 40 zettabytes by the end of 2020. This data explosion places a significant
burden not only on data storage space but also on access latency, manageability, and processing
and network bandwidth. However, large portions of this data volume contain
massive redundancies that are created by users, applications, systems, and communication
models. Deduplication is a technique that reduces data volume by removing redundancies.
Reliability can even improve when data is replicated after deduplication.
Many deduplication techniques, such as storage data deduplication and network redundancy
elimination, have been proposed to reduce storage consumption and network
bandwidth consumption. However, existing solutions are not efficient enough to optimize
the data delivery path from clients to servers through the network. Hence we propose a holistic
deduplication framework that optimizes data along its path. Our deduplication framework
consists of three components: data sources or clients, networks, and servers. The
client component removes local redundancies in clients, the network component removes
redundant transfers coming from different clients, and the server component removes redundancies
coming from different networks.
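As a rough illustration of the general idea of deduplication described above, the following minimal sketch stores each unique chunk of data once and keeps a per-file list of chunk digests. It is not the dissertation's implementation: the `DedupStore` class, the fixed-size chunking, and the tiny chunk size are assumptions chosen only to keep the example short.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store (hypothetical): each unique chunk is kept once."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size  # assumed fixed-size chunking
        self.chunks = {}              # digest -> chunk bytes
        self.files = {}               # file name -> ordered list of digests

    def put(self, name, data):
        """Split data into chunks and store only chunks not seen before."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op for a duplicate chunk
            recipe.append(digest)
        self.files[name] = recipe

    def get(self, name):
        """Rebuild a file from its list of chunk digests."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        """Bytes actually kept after deduplication."""
        return sum(len(c) for c in self.chunks.values())


store = DedupStore(chunk_size=4)
store.put("a.txt", b"ABCDABCDEFGH")
store.put("b.txt", b"ABCDEFGH")
print(store.stored_bytes())  # bytes kept for 20 bytes of input
```

Fixed-size chunks keep the sketch simple; the abstract's structure-based granularity would instead choose chunk boundaries from the file or email structure.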
We designed and developed components for the proposed deduplication framework.
For the server component, we developed the Hybrid Email Deduplication System
that balances space savings against overhead for email systems. For the client
component, we developed the Structure Aware File and Email Deduplication for Cloud-based
Storage Systems, which is very fast and achieves good space savings by using
structure-based granularity. For the network component, we developed a system called
Software-defined Deduplication as a Network and Storage service, which performs in-network deduplication
and chains storage data deduplication and network redundancy elimination
functions by using Software-Defined Networking to achieve both storage space and network
bandwidth savings with low processing time and memory size. We also discuss mobile
deduplication for image and video files on mobile devices. Through system implementations
and experiments, we show that the proposed framework effectively and efficiently
optimizes data volume in a holistic manner encompassing the entire data path of clients,
networks and storage servers.
Introduction -- Deduplication technology -- Existing deduplication approaches -- HEDS: Hybrid Email Deduplication System -- SAFE: Structure-aware File and Email Deduplication for cloud-based storage systems -- SoftDance: Software-defined Deduplication as a Network and Storage Service -- Mobile de-duplication -- Conclusion
Supervised Contrastive ResNet and Transfer Learning for the In-vehicle Intrusion Detection System
High-end vehicles have been furnished with a number of electronic control
units (ECUs), which provide upgrading functions to enhance the driving
experience. The controller area network (CAN) is a well-known protocol that
connects these ECUs because of its simplicity and efficiency. However, the CAN bus
is vulnerable to various types of attacks. Although intrusion detection
systems (IDSs) have been proposed to address the security problem of the CAN bus, most
previous studies only raise alerts when attacks occur, without identifying the
specific type of attack. Moreover, an IDS is usually designed for a specific car model
because car manufacturers differ. In this study, we propose a novel deep
learning model called supervised contrastive (SupCon) ResNet, which can handle
multiple attack identification on the CAN bus. Furthermore, the model can be
used to improve performance on a limited-size dataset using a transfer
learning technique. The capability of the proposed model is evaluated on two
real car datasets. When tested on the car hacking dataset, the experimental
results show that the SupCon ResNet model reduces the overall false-negative
rates of four types of attack by a factor of four on average, compared to other
models. In addition, the model achieves the highest F1 score, 0.9994, on the
survival dataset by utilizing transfer learning. Finally, the model can adapt
to hardware constraints in terms of memory size and running time.
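The abstract names supervised contrastive (SupCon) learning without giving the loss. The sketch below shows one commonly used form of the supervised contrastive loss for a single batch, in which samples sharing a label are pulled together and all others are pushed apart. It is an assumption-laden illustration, not the authors' model: the function name `supcon_loss`, the default temperature, and the toy embeddings are hypothetical, and every embedding is assumed to be non-zero.

```python
import math


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss, averaged over anchors that have a positive.

    embeddings: list of equal-length, non-zero float vectors (assumed)
    labels:     list of class labels, one per embedding
    """
    # L2-normalize so that dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        normed.append([x / norm for x in vec])

    n = len(normed)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-label sample contributes nothing
        # Scaled similarity of the anchor to every other sample in the batch.
        logits = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n) if j != i
        }
        # Numerically stable log-sum-exp for the denominator.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(s - peak) for s in logits.values()))
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


batch = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(supcon_loss(batch, [0, 0, 1, 1]))  # labels agree with the clusters
print(supcon_loss(batch, [0, 1, 0, 1]))  # labels cut across the clusters
```

The loss is small when same-label embeddings are already close and large when labels cut across the clusters, which is the property a classifier head or a transfer-learning step would build on.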
SCOB: Universal Text Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap
Inspired by the great success of language model (LM)-based pre-training,
recent studies in visual document understanding have explored LM-based
pre-training methods for modeling text within document images. Among them,
pre-training that reads all text from an image has shown promise, but often
exhibits instability and even fails when applied to broader domains, such as
those involving both visual documents and scene text images. This is a
substantial limitation for real-world scenarios, where the processing of text
image inputs in diverse domains is essential. In this paper, we investigate
effective pre-training tasks in the broader domains and also propose a novel
pre-training method called SCOB that leverages character-wise supervised
contrastive learning with online text rendering to effectively pre-train
document and scene text domains by bridging the domain gap. Moreover, SCOB
enables weakly supervised learning, significantly reducing annotation costs.
Extensive benchmarks demonstrate that SCOB generally improves vanilla
pre-training methods and achieves comparable performance to state-of-the-art
methods. Our findings suggest that SCOB can serve generally and effectively
for read-type pre-training methods. The code will be available at
https://github.com/naver-ai/scob.
Comment: ICCV 202
Network-Based Protein Biomarker Discovery Platforms
The advances in mass spectrometry-based proteomics technologies have enabled the generation of global proteome data from tissue or body fluid samples collected from a broad spectrum of human diseases. Comparative proteomic analysis of global proteome data identifies and prioritizes the proteins showing altered abundances, called differentially expressed proteins (DEPs), in disease samples compared to control samples. Protein biomarker candidates that can serve as indicators of disease states are then selected as key molecules among these proteins. Recently, it has been reported that cellular pathways can provide better indications of disease states than individual molecules, and that network analysis of the DEPs enables effective identification of both the cellular pathways altered in disease conditions and the key molecules representing those pathways. Accordingly, a number of network-based approaches to identify disease-related pathways and representative molecules of such pathways have been developed. In this review, we summarize analytical platforms for network-based protein biomarker discovery and the key components of these platforms.
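To make the notion of a differentially expressed protein concrete, here is a minimal sketch that flags proteins whose mean abundance differs between disease and control samples by at least a chosen log2 fold change. It is only an illustration under stated assumptions: the function name, the protein identifiers, the abundance values, and the cutoff are all hypothetical, abundances are assumed to be positive, and a real analysis would normally add a statistical test.

```python
import math


def differentially_expressed(disease, control, min_abs_log2fc=1.0):
    """Return {protein: log2 fold change} for proteins passing the cutoff.

    disease, control: dicts mapping a protein ID to a list of positive
    abundance values, one value per sample (assumed input layout).
    """
    selected = {}
    for protein in disease.keys() & control.keys():
        mean_disease = sum(disease[protein]) / len(disease[protein])
        mean_control = sum(control[protein]) / len(control[protein])
        log2fc = math.log2(mean_disease / mean_control)
        if abs(log2fc) >= min_abs_log2fc:
            selected[protein] = log2fc
    return selected


# Hypothetical abundances for three made-up protein IDs.
disease = {"P1": [8.0, 8.0], "P2": [2.0, 2.0], "P3": [1.0, 1.0]}
control = {"P1": [2.0, 2.0], "P2": [2.0, 2.0], "P3": [4.0, 4.0]}
print(differentially_expressed(disease, control))
```

The proteins selected this way would be the input to the network analysis the review describes, where they are mapped onto pathways rather than ranked one by one.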